Overview

Dataset statistics

Number of variables: 23
Number of observations: 362
Missing cells: 693
Missing cells (%): 8.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 138.3 KiB
Average record size in memory: 391.4 B
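
The missing-cell percentage can be cross-checked against the per-variable missing counts reported in the variable sections of this report. The sketch below assumes those counts (all other variables report zero missing values); the dictionary and the function name `missing_cell_share` are illustrative, not part of pandas-profiling.

```python
# Per-variable missing counts, copied from the variable sections of this report.
# Variables not listed here report 0 missing values.
missing_per_variable = {
    "sub_region_2": 362,
    "confirm_mv3": 2,
    "confirm_mv7": 6,
    "confirm_mv14": 13,
    "confirm_mv21": 20,
    "confirm_mv28": 27,
    "TMAX": 206,
    "TMIN": 57,
}


def missing_cell_share(missing_counts, n_variables, n_observations):
    """Return (total missing cells, fraction of all cells that are missing)."""
    total_missing = sum(missing_counts.values())
    total_cells = n_variables * n_observations
    return total_missing, total_missing / total_cells


total, share = missing_cell_share(missing_per_variable, n_variables=23, n_observations=362)
print(total, f"{share:.1%}")
```

Under these assumptions the total comes to 693 missing cells out of 23 × 362 = 8326, which rounds to the 8.3% shown above.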

Variable types

NUM: 17
CAT: 4
DATE: 1
UNSUPPORTED: 1

Reproduction

Analysis started: 2020-05-31 18:06:25.364377
Analysis finished: 2020-05-31 18:07:42.123325
Duration: 1 minute and 16.76 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

country_region_code has constant value "IN" Constant
country_region has constant value "India" Constant
transit_stations is highly correlated with retail_and_recreation and 2 other fields High correlation
retail_and_recreation is highly correlated with transit_stations High correlation
grocery_and_pharmacy is highly correlated with transit_stations High correlation
workplaces is highly correlated with transit_stations and 1 other fields High correlation
residential is highly correlated with workplaces High correlation
confirm_mv21 is highly correlated with confirm_mv14 and 1 other fields High correlation
confirm_mv14 is highly correlated with confirm_mv21 High correlation
confirm_mv28 is highly correlated with confirm_mv21 High correlation
state_code is highly correlated with state_codes High correlation
state_codes is highly correlated with state_code High correlation
sub_region_2 has 362 (100.0%) missing values Missing
confirm_mv7 has 6 (1.7%) missing values Missing
confirm_mv14 has 13 (3.6%) missing values Missing
confirm_mv21 has 20 (5.5%) missing values Missing
confirm_mv28 has 27 (7.5%) missing values Missing
TMAX has 206 (56.9%) missing values Missing
TMIN has 57 (15.7%) missing values Missing
df_index has unique values Unique
sub_region_2 is an unsupported type, check if it needs cleaning or further analysis Unsupported
grocery_and_pharmacy has 5 (1.4%) zeros Zeros
parks has 6 (1.7%) zeros Zeros
confirm_cases has 18 (5.0%) zeros Zeros
confirm_mv3 has 4 (1.1%) zeros Zeros
PRCP has 332 (91.7%) zeros Zeros

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct count: 362
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 180.5
Minimum: 0
Maximum: 361
Zeros: 1
Zeros (%): 0.3%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 18.05
Q1: 90.25
median: 180.5
Q3: 270.75
95-th percentile: 342.95
Maximum: 361
Range: 361
Interquartile range (IQR): 180.5

Descriptive statistics

Standard deviation: 104.6446367
Coefficient of variation (CV): 0.57974868
Kurtosis: -1.2
Mean: 180.5
Median Absolute Deviation (MAD): 90.5
Skewness: 0
Sum: 65341
Variance: 10950.5
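
The figures above are what one would expect if df_index is simply the row counter 0, 1, ..., 361. The following sketch recomputes several of them with Python's standard `statistics` module under that assumption (the report itself does not list every value). It uses the sample variance (n - 1 denominator) and linearly interpolated quartiles, which appear to match the reported numbers.

```python
import statistics

# Assumption: df_index takes each integer from 0 to 361 exactly once.
values = list(range(362))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation

# Quartiles with linear interpolation between the closest ranks.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std, 7), round(cv, 8))
print(q1, median, q3, iqr, mad)
```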
Histogram with fixed size bins (bins=10)
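Value | Count | Frequency (%)
361 | 1 | 0.3%
113 | 1 | 0.3%
115 | 1 | 0.3%
116 | 1 | 0.3%
117 | 1 | 0.3%
118 | 1 | 0.3%
119 | 1 | 0.3%
120 | 1 | 0.3%
121 | 1 | 0.3%
122 | 1 | 0.3%
123 | 1 | 0.3%
124 | 1 | 0.3%
125 | 1 | 0.3%
126 | 1 | 0.3%
127 | 1 | 0.3%
128 | 1 | 0.3%
129 | 1 | 0.3%
130 | 1 | 0.3%
131 | 1 | 0.3%
132 | 1 | 0.3%
133 | 1 | 0.3%
114 | 1 | 0.3%
112 | 1 | 0.3%
90 | 1 | 0.3%
111 | 1 | 0.3%
Other values (337) | 337 | 93.1%

Value | Count | Frequency (%)
0 | 1 | 0.3%
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%

Value | Count | Frequency (%)
361 | 1 | 0.3%
360 | 1 | 0.3%
359 | 1 | 0.3%
358 | 1 | 0.3%
357 | 1 | 0.3%
356 | 1 | 0.3%
355 | 1 | 0.3%
354 | 1 | 0.3%
353 | 1 | 0.3%
352 | 1 | 0.3%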

Date
Date

Distinct count: 61
Unique (%): 16.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB
Minimum: 2020-03-14 00:00:00
Maximum: 2020-05-13 00:00:00
Histogram

country_region_code
Categorical

CONSTANT
REJECTED

Distinct count1
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size3.0 KiB
ValueCountFrequency (%) 
IN362100.0%
 

Length

Max length2
Median length2
Mean length2
Min length2

Overview of Unicode Properties

Unique unicode characters2
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
I36250.0%
 
N36250.0%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter724100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
I36250.0%
 
N36250.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin724100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
I36250.0%
 
N36250.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII724100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
I36250.0%
 
N36250.0%
 

country_region
Categorical

CONSTANT
REJECTED

Distinct count1
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size3.0 KiB
ValueCountFrequency (%) 
India362100.0%
 

Length

Max length5
Median length5
Mean length5
Min length5

Overview of Unicode Properties

Unique unicode characters5
Unique unicode categories (?)2
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
I36220.0%
 
n36220.0%
 
d36220.0%
 
i36220.0%
 
a36220.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter144880.0%
 
Uppercase Letter36220.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
I362100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n36225.0%
 
d36225.0%
 
i36225.0%
 
a36225.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin1810100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
I36220.0%
 
n36220.0%
 
d36220.0%
 
i36220.0%
 
a36220.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1810100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
I36220.0%
 
n36220.0%
 
d36220.0%
 
i36220.0%
 
a36220.0%
 

state_codes
Categorical

HIGH CORRELATION

Distinct count6
Unique (%)1.7%
Missing0
Missing (%)0.0%
Memory size3.0 KiB
Value | Count | Frequency (%)
KA | 61 | 16.9%
TN | 61 | 16.9%
MH | 61 | 16.9%
RJ | 61 | 16.9%
DL | 61 | 16.9%
GJ | 57 | 15.7%
 

Length

Max length2
Median length2
Mean length2
Min length2

Overview of Unicode Properties

Unique unicode characters11
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
J11816.3%
 
T618.4%
 
N618.4%
 
M618.4%
 
H618.4%
 
D618.4%
 
L618.4%
 
R618.4%
 
K618.4%
 
A618.4%
 
G577.9%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter724100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
J11816.3%
 
T618.4%
 
N618.4%
 
M618.4%
 
H618.4%
 
D618.4%
 
L618.4%
 
R618.4%
 
K618.4%
 
A618.4%
 
G577.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin724100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
J11816.3%
 
T618.4%
 
N618.4%
 
M618.4%
 
H618.4%
 
D618.4%
 
L618.4%
 
R618.4%
 
K618.4%
 
A618.4%
 
G577.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII724100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
J11816.3%
 
T618.4%
 
N618.4%
 
M618.4%
 
H618.4%
 
D618.4%
 
L618.4%
 
R618.4%
 
K618.4%
 
A618.4%
 
G577.9%
 

sub_region_2
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing362
Missing (%)100.0%
Memory size3.0 KiB

retail_and_recreation
Real number (ℝ)

HIGH CORRELATION

Distinct count51
Unique (%)14.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-73.34530386740332
Minimum-91.0
Maximum-4.0
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum-91
5-th percentile-89
Q1-86
median-82
Q3-75
95-th percentile-12.05
Maximum-4
Range87
Interquartile range (IQR)11

Descriptive statistics

Standard deviation23.01979378
Coefficient of variation (CV)-0.3138550469
Kurtosis2.617082152
Mean-73.34530387
Median Absolute Deviation (MAD)4
Skewness2.033815623
Sum-26551
Variance529.9109059
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-82359.7%
 
-83298.0%
 
-87267.2%
 
-86256.9%
 
-81205.5%
 
-85205.5%
 
-74205.5%
 
-80185.0%
 
-84164.4%
 
-79154.1%
 
-89154.1%
 
-88133.6%
 
-78123.3%
 
-90102.8%
 
-7592.5%
 
-7771.9%
 
-9151.4%
 
-1741.1%
 
-7341.1%
 
-941.1%
 
-7641.1%
 
-841.1%
 
-1841.1%
 
-1241.1%
 
-730.8%
 
Other values (26)369.9%
 
ValueCountFrequency (%) 
-9151.4%
 
-90102.8%
 
-89154.1%
 
-88133.6%
 
-87267.2%
 
-86256.9%
 
-85205.5%
 
-84164.4%
 
-83298.0%
 
-82359.7%
 
ValueCountFrequency (%) 
-410.3%
 
-510.3%
 
-730.8%
 
-841.1%
 
-941.1%
 
-1010.3%
 
-1110.3%
 
-1241.1%
 
-1320.6%
 
-1410.3%
 

grocery_and_pharmacy
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct count88
Unique (%)24.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-46.20165745856354
Minimum-84.0
Maximum21.0
Zeros5
Zeros (%)1.4%
Memory size3.0 KiB

Quantile statistics

Minimum-84
5-th percentile-73
Q1-64
median-52
Q3-37
95-th percentile1
Maximum21
Range105
Interquartile range (IQR)27

Descriptive statistics

Standard deviation22.91549446
Coefficient of variation (CV)-0.4959885797
Kurtosis0.07542951494
Mean-46.20165746
Median Absolute Deviation (MAD)12
Skewness0.9881943451
Sum-16725
Variance525.1198864
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-66185.0%
 
-52174.7%
 
-64154.1%
 
-51143.9%
 
-53113.0%
 
-65113.0%
 
-63113.0%
 
-61113.0%
 
-57102.8%
 
-50102.8%
 
-4482.2%
 
-6782.2%
 
-6882.2%
 
-5482.2%
 
-182.2%
 
-5971.9%
 
-4971.9%
 
361.7%
 
-4651.4%
 
-4551.4%
 
-3751.4%
 
-4351.4%
 
-4151.4%
 
-5851.4%
 
-3551.4%
 
Other values (63)13938.4%
 
ValueCountFrequency (%) 
-8410.3%
 
-8310.3%
 
-8110.3%
 
-7910.3%
 
-7810.3%
 
-7741.1%
 
-7641.1%
 
-7510.3%
 
-7430.8%
 
-7330.8%
 
ValueCountFrequency (%) 
2110.3%
 
1710.3%
 
1110.3%
 
1010.3%
 
910.3%
 
810.3%
 
720.6%
 
520.6%
 
410.3%
 
361.7%
 

parks
Real number (ℝ)

ZEROS

Distinct count79
Unique (%)21.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-57.9585635359116
Minimum-98.0
Maximum3.0
Zeros6
Zeros (%)1.7%
Memory size3.0 KiB

Quantile statistics

Minimum-98
5-th percentile-95.95
Q1-74
median-62.5
Q3-45.25
95-th percentile-7
Maximum3
Range101
Interquartile range (IQR)28.75

Descriptive statistics

Standard deviation24.2518171
Coefficient of variation (CV)-0.4184337158
Kurtosis0.1238372169
Mean-57.95856354
Median Absolute Deviation (MAD)12.5
Skewness0.7800473611
Sum-20981
Variance588.1506328
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-75154.1%
 
-69143.9%
 
-74143.9%
 
-98143.9%
 
-71133.6%
 
-61123.3%
 
-76123.3%
 
-78123.3%
 
-77102.8%
 
-7392.5%
 
-7992.5%
 
-5592.5%
 
-4182.2%
 
-5882.2%
 
-5082.2%
 
-7082.2%
 
-9582.2%
 
-7271.9%
 
-5971.9%
 
-6061.7%
 
-5361.7%
 
-6861.7%
 
-5761.7%
 
-5461.7%
 
-4061.7%
 
Other values (54)12935.6%
 
ValueCountFrequency (%) 
-98143.9%
 
-9710.3%
 
-9641.1%
 
-9582.2%
 
-9410.3%
 
-8120.6%
 
-8010.3%
 
-7992.5%
 
-78123.3%
 
-77102.8%
 
ValueCountFrequency (%) 
310.3%
 
210.3%
 
110.3%
 
061.7%
 
-120.6%
 
-210.3%
 
-341.1%
 
-520.6%
 
-720.6%
 
-810.3%
 

transit_stations
Real number (ℝ)

HIGH CORRELATION

Distinct count70
Unique (%)19.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-60.74585635359116
Minimum-88.0
Maximum1.0
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum-88
5-th percentile-84
Q1-74
median-66
Q3-58
95-th percentile-10
Maximum1
Range89
Interquartile range (IQR)16

Descriptive statistics

Standard deviation21.10313802
Coefficient of variation (CV)-0.3474004531
Kurtosis1.234584811
Mean-60.74585635
Median Absolute Deviation (MAD)8
Skewness1.41550628
Sum-21990
Variance445.3424343
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-72174.7%
 
-71164.4%
 
-62164.4%
 
-67154.1%
 
-68154.1%
 
-61143.9%
 
-75143.9%
 
-70123.3%
 
-60123.3%
 
-64113.0%
 
-66102.8%
 
-63102.8%
 
-59102.8%
 
-8492.5%
 
-5892.5%
 
-7692.5%
 
-6971.9%
 
-6571.9%
 
-8371.9%
 
-8261.7%
 
-761.7%
 
-7761.7%
 
-8761.7%
 
-1561.7%
 
-7351.4%
 
Other values (45)10729.6%
 
ValueCountFrequency (%) 
-8810.3%
 
-8761.7%
 
-8651.4%
 
-8551.4%
 
-8492.5%
 
-8371.9%
 
-8261.7%
 
-8141.1%
 
-8051.4%
 
-7951.4%
 
ValueCountFrequency (%) 
110.3%
 
-210.3%
 
-310.3%
 
-420.6%
 
-541.1%
 
-761.7%
 
-820.6%
 
-910.3%
 
-1020.6%
 
-1120.6%
 

workplaces
Real number (ℝ)

HIGH CORRELATION

Distinct count78
Unique (%)21.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean-57.94198895027624
Minimum-85.0
Maximum5.0
Zeros2
Zeros (%)0.6%
Memory size3.0 KiB

Quantile statistics

Minimum-85
5-th percentile-82
Q1-73.75
median-66
Q3-49.25
95-th percentile-5
Maximum5
Range90
Interquartile range (IQR)24.5

Descriptive statistics

Standard deviation22.37676512
Coefficient of variation (CV)-0.3861925613
Kurtosis0.8035246619
Mean-57.94198895
Median Absolute Deviation (MAD)9
Skewness1.309077748
Sum-20975
Variance500.7196171
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-74215.8%
 
-73154.1%
 
-69154.1%
 
-71154.1%
 
-67123.3%
 
-70123.3%
 
-75123.3%
 
-72113.0%
 
-81102.8%
 
-82102.8%
 
-6892.5%
 
-6292.5%
 
-7792.5%
 
-6092.5%
 
-5792.5%
 
-7682.2%
 
-4482.2%
 
-5571.9%
 
-6171.9%
 
-6461.7%
 
-561.7%
 
-6561.7%
 
-4751.4%
 
-4651.4%
 
-8451.4%
 
Other values (53)12133.4%
 
ValueCountFrequency (%) 
-8520.6%
 
-8451.4%
 
-8330.8%
 
-82102.8%
 
-81102.8%
 
-8030.8%
 
-7930.8%
 
-7851.4%
 
-7792.5%
 
-7682.2%
 
ValueCountFrequency (%) 
510.3%
 
210.3%
 
141.1%
 
020.6%
 
-110.3%
 
-230.8%
 
-310.3%
 
-420.6%
 
-561.7%
 
-610.3%
 

residential
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count36
Unique (%)9.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean26.83977900552486
Minimum0.0
Maximum39.0
Zeros1
Zeros (%)0.3%
Memory size3.0 KiB

Quantile statistics

Minimum0
5-th percentile4
Q123
median31
Q334
95-th percentile36
Maximum39
Range39
Interquartile range (IQR)11

Descriptive statistics

Standard deviation9.974303158
Coefficient of variation (CV)0.3716238929
Kurtosis0.5589495133
Mean26.83977901
Median Absolute Deviation (MAD)5
Skewness-1.233335468
Sum9716
Variance99.4867235
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
353910.8%
 
34359.7%
 
36328.8%
 
33298.0%
 
25256.9%
 
32205.5%
 
26174.7%
 
24174.7%
 
37143.9%
 
21143.9%
 
31143.9%
 
23113.0%
 
392.5%
 
2882.2%
 
471.9%
 
3071.9%
 
771.9%
 
261.7%
 
2961.7%
 
2261.7%
 
1951.4%
 
551.4%
 
851.4%
 
2041.1%
 
1741.1%
 
Other values (11)164.4%
 
ValueCountFrequency (%) 
010.3%
 
120.6%
 
261.7%
 
392.5%
 
471.9%
 
551.4%
 
771.9%
 
851.4%
 
1030.8%
 
1110.3%
 
ValueCountFrequency (%) 
3920.6%
 
3810.3%
 
37143.9%
 
36328.8%
 
353910.8%
 
34359.7%
 
33298.0%
 
32205.5%
 
31143.9%
 
3071.9%
 

confirm_cases
Real number (ℝ≥0)

ZEROS

Distinct count199
Unique (%)55.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean156.45027624309392
Minimum0.0
Maximum1943.0
Zeros18
Zeros (%)5.0%
Memory size3.0 KiB

Quantile statistics

Minimum0
5-th percentile1
Q19
median54
Q3173
95-th percentile677.55
Maximum1943
Range1943
Interquartile range (IQR)164

Descriptive statistics

Standard deviation264.4502352
Coefficient of variation (CV)1.690314914
Kurtosis12.09451051
Mean156.4502762
Median Absolute Deviation (MAD)50
Skewness3.117182868
Sum56635
Variance69933.92688
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0185.0%
 
4154.1%
 
3113.0%
 
192.5%
 
682.2%
 
782.2%
 
571.9%
 
1271.9%
 
871.9%
 
1361.7%
 
1061.7%
 
261.7%
 
961.7%
 
1761.7%
 
7651.4%
 
1841.1%
 
1441.1%
 
9841.1%
 
11030.8%
 
2330.8%
 
5130.8%
 
10230.8%
 
2530.8%
 
9330.8%
 
10630.8%
 
Other values (174)20456.4%
 
ValueCountFrequency (%) 
0185.0%
 
192.5%
 
261.7%
 
3113.0%
 
4154.1%
 
571.9%
 
682.2%
 
782.2%
 
871.9%
 
961.7%
 
ValueCountFrequency (%) 
194310.3%
 
156710.3%
 
149510.3%
 
123310.3%
 
123010.3%
 
121610.3%
 
116510.3%
 
108910.3%
 
102610.3%
 
100810.3%
 

confirm_mv3
Real number (ℝ≥0)

ZEROS

Distinct count265
Unique (%)73.6%
Missing2
Missing (%)0.6%
Infinite0
Infinite (%)0.0%
Mean162.58333333333331
Minimum0.0
Maximum1825.6666666666667
Zeros4
Zeros (%)1.1%
Memory size3.0 KiB

Quantile statistics

Minimum0
5-th percentile1.983333333
Q110
median65.16666667
Q3186
95-th percentile659.9
Maximum1825.666667
Range1825.666667
Interquartile range (IQR)176

Descriptive statistics

Standard deviation265.3814022
Coefficient of variation (CV)1.632279255
Kurtosis10.4737803
Mean162.5833333
Median Absolute Deviation (MAD)59
Skewness2.96884775
Sum58530
Variance70427.28861
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
6.66666666761.7%
 
2.66666666751.4%
 
5.66666666751.4%
 
0.333333333341.1%
 
041.1%
 
241.1%
 
741.1%
 
11.3333333341.1%
 
2.33333333341.1%
 
941.1%
 
4.33333333341.1%
 
3.33333333341.1%
 
141.1%
 
7.66666666741.1%
 
0.666666666730.8%
 
6.33333333330.8%
 
5.33333333330.8%
 
630.8%
 
7030.8%
 
430.8%
 
16.3333333330.8%
 
129.333333320.6%
 
100.666666720.6%
 
141.333333320.6%
 
43.6666666720.6%
 
Other values (240)27174.9%
 
ValueCountFrequency (%) 
041.1%
 
0.333333333341.1%
 
0.666666666730.8%
 
141.1%
 
1.33333333320.6%
 
1.66666666710.3%
 
241.1%
 
2.33333333341.1%
 
2.66666666751.4%
 
310.3%
 
ValueCountFrequency (%) 
1825.66666710.3%
 
144610.3%
 
1399.66666710.3%
 
139910.3%
 
1261.33333310.3%
 
1250.33333310.3%
 
1179.33333310.3%
 
1156.66666710.3%
 
1144.33333310.3%
 
1076.33333310.3%
 

confirm_mv7
Real number (ℝ≥0)

MISSING

Distinct count312
Unique (%)87.6%
Missing6
Missing (%)1.7%
Infinite0
Infinite (%)0.0%
Mean174.30577849117176
Minimum0.42857142857142855
Maximum2218.714285714286
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum0.4285714286
5-th percentile3.142857143
Q114.10714286
median73.85714286
Q3210.5357143
95-th percentile666
Maximum2218.714286
Range2218.285714
Interquartile range (IQR)196.4285714

Descriptive statistics

Standard deviation285.4354687
Coefficient of variation (CV)1.637555973
Kurtosis14.90047572
Mean174.3057785
Median Absolute Deviation (MAD)64.5
Skewness3.392896633
Sum62052.85714
Variance81473.40678
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2.85714285751.4%
 
5.71428571441.1%
 
4.42857142930.8%
 
5.85714285730.8%
 
6.42857142930.8%
 
22.5714285730.8%
 
7230.8%
 
3.14285714330.8%
 
8.42857142920.6%
 
220.6%
 
7.85714285720.6%
 
3.42857142920.6%
 
10.1428571420.6%
 
720.6%
 
95.4285714320.6%
 
9320.6%
 
73.8571428620.6%
 
18.4285714320.6%
 
14.8571428620.6%
 
8.71428571420.6%
 
64.1428571420.6%
 
420.6%
 
14.4285714320.6%
 
255.714285720.6%
 
4.28571428620.6%
 
Other values (287)29581.5%
 
(Missing)61.7%
 
ValueCountFrequency (%) 
0.428571428610.3%
 
0.714285714310.3%
 
110.3%
 
1.14285714310.3%
 
1.57142857110.3%
 
220.6%
 
2.14285714310.3%
 
2.42857142920.6%
 
2.57142857110.3%
 
2.85714285751.4%
 
ValueCountFrequency (%) 
2218.71428610.3%
 
191010.3%
 
157510.3%
 
1313.85714310.3%
 
1309.14285710.3%
 
1271.71428610.3%
 
1265.71428610.3%
 
115510.3%
 
1133.14285710.3%
 
1079.57142910.3%
 

confirm_mv14
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count331
Unique (%)94.8%
Missing13
Missing (%)3.6%
Infinite0
Infinite (%)0.0%
Mean191.55812525583298
Minimum2.7142857142857144
Maximum2017.142857142857
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum2.714285714
5-th percentile5.414285714
Q121
median94.35714286
Q3229.4285714
95-th percentile768.2714286
Maximum2017.142857
Range2014.428571
Interquartile range (IQR)208.4285714

Descriptive statistics

Standard deviation296.605435
Coefficient of variation (CV)1.548383472
Kurtosis13.36954048
Mean191.5581253
Median Absolute Deviation (MAD)77.85714286
Skewness3.313654571
Sum66853.78571
Variance87974.78409
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
8.85714285720.6%
 
168.142857120.6%
 
1320.6%
 
11.3571428620.6%
 
24.6428571420.6%
 
10.1428571420.6%
 
9.28571428620.6%
 
65.1428571420.6%
 
17.3571428620.6%
 
9.57142857120.6%
 
111.142857120.6%
 
8.07142857120.6%
 
5.35714285720.6%
 
19.9285714320.6%
 
43.520.6%
 
3.92857142920.6%
 
64.7142857120.6%
 
7920.6%
 
98.9285714310.3%
 
4.57142857110.3%
 
101.510.3%
 
1143.35714310.3%
 
101.642857110.3%
 
102.071428610.3%
 
102.642857110.3%
 
Other values (306)30684.5%
 
(Missing)133.6%
 
ValueCountFrequency (%) 
2.71428571410.3%
 
2.85714285710.3%
 
2.92857142910.3%
 
310.3%
 
3.35714285710.3%
 
3.510.3%
 
3.57142857110.3%
 
3.64285714310.3%
 
3.92857142920.6%
 
4.510.3%
 
ValueCountFrequency (%) 
2017.14285710.3%
 
1910.35714310.3%
 
1795.92857110.3%
 
1683.35714310.3%
 
1568.64285710.3%
 
1401.14285710.3%
 
1258.28571410.3%
 
1143.35714310.3%
 
1110.35714310.3%
 
1079.21428610.3%
 

confirm_mv21
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count329
Unique (%)96.2%
Missing20
Missing (%)5.5%
Infinite0
Infinite (%)0.0%
Mean209.8784461152882
Minimum4.523809523809524
Maximum1768.6666666666667
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum4.523809524
5-th percentile9.828571429
Q139.97619048
median101.9285714
Q3261.0357143
95-th percentile838.2
Maximum1768.666667
Range1764.142857
Interquartile range (IQR)221.0595238

Descriptive statistics

Standard deviation297.2514068
Coefficient of variation (CV)1.416302685
Kurtosis9.893702047
Mean209.8784461
Median Absolute Deviation (MAD)80.69047619
Skewness2.957246134
Sum71778.42857
Variance88358.39887
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
21.6190476230.8%
 
15.095238120.6%
 
55.2857142920.6%
 
65.1428571420.6%
 
12.3809523820.6%
 
6.09523809520.6%
 
18.3809523820.6%
 
102.523809520.6%
 
51.9523809520.6%
 
62.6666666720.6%
 
18.1904761920.6%
 
113.857142920.6%
 
742.190476210.3%
 
184.761904810.3%
 
211.714285710.3%
 
58510.3%
 
1452.28571410.3%
 
7.90476190510.3%
 
289.666666710.3%
 
557.666666710.3%
 
152.095238110.3%
 
88.3333333310.3%
 
581.285714310.3%
 
357.380952410.3%
 
166.714285710.3%
 
Other values (304)30484.0%
 
(Missing)205.5%
 
ValueCountFrequency (%) 
4.52380952410.3%
 
5.14285714310.3%
 
6.09523809520.6%
 
6.57142857110.3%
 
6.90476190510.3%
 
6.95238095210.3%
 
7.42857142910.3%
 
7.90476190510.3%
 
7.95238095210.3%
 
8.33333333310.3%
 
ValueCountFrequency (%) 
1768.66666710.3%
 
1709.95238110.3%
 
1652.04761910.3%
 
1600.19047610.3%
 
1544.71428610.3%
 
1452.28571410.3%
 
1393.95238110.3%
 
1345.42857110.3%
 
1274.42857110.3%
 
1198.71428610.3%
 

confirm_mv28
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count332
Unique (%)99.1%
Missing27
Missing (%)7.5%
Infinite0
Infinite (%)0.0%
Mean228.26673773987204
Minimum7.392857142857143
Maximum1548.1785714285713
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum7.392857143
5-th percentile14.98928571
Q157.66071429
median120.2857143
Q3273.3392857
95-th percentile861.1642857
Maximum1548.178571
Range1540.785714
Interquartile range (IQR)215.6785714

Descriptive statistics

Standard deviation292.8924497
Coefficient of variation (CV)1.283114888
Kurtosis7.339149978
Mean228.2667377
Median Absolute Deviation (MAD)82.10714286
Skewness2.629236548
Sum76469.35714
Variance85785.98709
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1320.6%
 
52.4285714320.6%
 
75.9285714320.6%
 
165.821428610.3%
 
8.96428571410.3%
 
10.7142857110.3%
 
14.6428571410.3%
 
1010.53571410.3%
 
1510.3%
 
14.9642857110.3%
 
342.142857110.3%
 
15.1428571410.3%
 
15.7510.3%
 
16.5357142910.3%
 
1046.89285710.3%
 
104.785714310.3%
 
17.7857142910.3%
 
105.857142910.3%
 
18.2857142910.3%
 
957.428571410.3%
 
19.510.3%
 
2210.3%
 
96.6071428610.3%
 
13.3928571410.3%
 
900.464285710.3%
 
Other values (307)30784.8%
 
(Missing)277.5%
 
ValueCountFrequency (%) 
7.39285714310.3%
 
7.46428571410.3%
 
8.07142857110.3%
 
8.57142857110.3%
 
8.96428571410.3%
 
9.46428571410.3%
 
10.7142857110.3%
 
12.2857142910.3%
 
1320.6%
 
13.3928571410.3%
 
ValueCountFrequency (%) 
1548.17857110.3%
 
1526.85714310.3%
 
1506.03571410.3%
 
1470.03571410.3%
 
1441.82142910.3%
 
1417.67857110.3%
 
1361.89285710.3%
 
132710.3%
 
1283.10714310.3%
 
1240.10714310.3%
 

state_code
Categorical

HIGH CORRELATION

Distinct count6
Unique (%)1.7%
Missing0
Missing (%)0.0%
Memory size3.0 KiB
Value | Count | Frequency (%)
KA | 61 | 16.9%
TN | 61 | 16.9%
MH | 61 | 16.9%
RJ | 61 | 16.9%
DL | 61 | 16.9%
GJ | 57 | 15.7%
 

Length

Max length2
Median length2
Mean length2
Min length2

Overview of Unicode Properties

Unique unicode characters11
Unique unicode categories (?)1
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
J11816.3%
 
T618.4%
 
N618.4%
 
M618.4%
 
H618.4%
 
D618.4%
 
L618.4%
 
R618.4%
 
K618.4%
 
A618.4%
 
G577.9%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter724100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
J11816.3%
 
T618.4%
 
N618.4%
 
M618.4%
 
H618.4%
 
D618.4%
 
L618.4%
 
R618.4%
 
K618.4%
 
A618.4%
 
G577.9%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin724100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
J11816.3%
 
T618.4%
 
N618.4%
 
M618.4%
 
H618.4%
 
D618.4%
 
L618.4%
 
R618.4%
 
K618.4%
 
A618.4%
 
G577.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII724100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
J11816.3%
 
T618.4%
 
N618.4%
 
M618.4%
 
H618.4%
 
D618.4%
 
L618.4%
 
R618.4%
 
K618.4%
 
A618.4%
 
G577.9%
 

TMAX
Real number (ℝ≥0)

MISSING

Distinct count67
Unique (%)42.9%
Missing206
Missing (%)56.9%
Infinite0
Infinite (%)0.0%
Mean355.80128205128204
Minimum256.0
Maximum426.0
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum256
5-th percentile319
Q1342
median352
Q3370.5
95-th percentile402.5
Maximum426
Range170
Interquartile range (IQR)28.5

Descriptive statistics

Standard deviation27.27977388
Coefficient of variation (CV)0.076671376
Kurtosis1.753832655
Mean355.8012821
Median Absolute Deviation (MAD)14
Skewness-0.141132967
Sum55505
Variance744.1860629
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
34692.5%
 
34482.2%
 
35271.9%
 
34561.7%
 
35661.7%
 
35461.7%
 
38051.4%
 
36051.4%
 
33851.4%
 
33641.1%
 
37041.1%
 
34741.1%
 
34041.1%
 
33230.8%
 
34330.8%
 
36630.8%
 
37230.8%
 
34230.8%
 
38230.8%
 
40220.6%
 
36420.6%
 
42020.6%
 
38620.6%
 
32220.6%
 
39620.6%
 
Other values (42)5314.6%
 
(Missing)20656.9%
 
ValueCountFrequency (%) 
25610.3%
 
27010.3%
 
27110.3%
 
29010.3%
 
30810.3%
 
31010.3%
 
31510.3%
 
31610.3%
 
32010.3%
 
32220.6%
 
ValueCountFrequency (%) 
42610.3%
 
42210.3%
 
42020.6%
 
41220.6%
 
40510.3%
 
40410.3%
 
40220.6%
 
40020.6%
 
39910.3%
 
39810.3%
 

TMIN
Real number (ℝ≥0)

MISSING

Distinct count109
Unique (%)35.7%
Missing57
Missing (%)15.7%
Infinite0
Infinite (%)0.0%
Mean231.04262295081966
Minimum124.0
Maximum294.0
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum124
5-th percentile169
Q1208
median234
Q3258
95-th percentile276
Maximum294
Range170
Interquartile range (IQR)50

Descriptive statistics

Standard deviation34.33390826
Coefficient of variation (CV)0.1486042178
Kurtosis-0.1567280481
Mean231.042623
Median Absolute Deviation (MAD)25
Skewness-0.572689894
Sum70468
Variance1178.817256
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
23092.5%
 
22692.5%
 
21092.5%
 
23292.5%
 
26471.9%
 
19671.9%
 
24071.9%
 
27671.9%
 
25871.9%
 
27061.7%
 
20661.7%
 
24561.7%
 
24661.7%
 
24451.4%
 
23451.4%
 
24151.4%
 
26851.4%
 
22051.4%
 
25051.4%
 
20241.1%
 
27441.1%
 
26041.1%
 
18041.1%
 
22441.1%
 
22141.1%
 
Other values (84)15643.1%
 
(Missing)5715.7%
 
ValueCountFrequency (%) 
12410.3%
 
13010.3%
 
13610.3%
 
14010.3%
 
14810.3%
 
15110.3%
 
15610.3%
 
15710.3%
 
15910.3%
 
16210.3%
 
ValueCountFrequency (%) 
29410.3%
 
29210.3%
 
29010.3%
 
28710.3%
 
28610.3%
 
28520.6%
 
28410.3%
 
28030.8%
 
27910.3%
 
27820.6%
 

PRCP
Real number (ℝ≥0)

ZEROS

Distinct count15
Unique (%)4.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.81767955801105
Minimum0.0
Maximum333.0
Zeros332
Zeros (%)91.7%
Memory size3.0 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile20
Maximum333
Range333
Interquartile range (IQR)0

Descriptive statistics

Standard deviation32.95115023
Coefficient of variation (CV)5.663967893
Kurtosis68.61345343
Mean5.817679558
Median Absolute Deviation (MAD)0
Skewness7.966133005
Sum2106
Variance1085.778302
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
033291.7%
 
551.4%
 
2041.1%
 
3041.1%
 
8941.1%
 
1020.6%
 
6120.6%
 
4120.6%
 
30710.3%
 
33310.3%
 
23110.3%
 
30010.3%
 
7110.3%
 
5110.3%
 
810.3%
 
ValueCountFrequency (%) 
033291.7%
 
551.4%
 
810.3%
 
1020.6%
 
2041.1%
 
3041.1%
 
4120.6%
 
5110.3%
 
6120.6%
 
7110.3%
 
ValueCountFrequency (%) 
33310.3%
 
30710.3%
 
30010.3%
 
23110.3%
 
8941.1%
 
7110.3%
 
6120.6%
 
5110.3%
 
4120.6%
 
3041.1%
 

TAVG
Real number (ℝ≥0)

Distinct count126
Unique (%)34.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean292.61878453038673
Minimum177.0
Maximum364.0
Zeros0
Zeros (%)0.0%
Memory size3.0 KiB

Quantile statistics

Minimum177
5-th percentile236.05
Q1274
median295
Q3311.75
95-th percentile346.95
Maximum364
Range187
Interquartile range (IQR)37.75

Descriptive statistics

Standard deviation32.06241283
Coefficient of variation (CV)0.1095705899
Kurtosis0.7728884019
Mean292.6187845
Median Absolute Deviation (MAD)18
Skewness-0.4437145521
Sum105928
Variance1027.998317
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
307143.9%
 
31292.5%
 
30692.5%
 
30192.5%
 
28892.5%
 
30982.2%
 
29382.2%
 
29271.9%
 
27171.9%
 
27771.9%
 
30471.9%
 
30871.9%
 
29671.9%
 
31961.7%
 
29861.7%
 
27861.7%
 
26961.7%
 
28361.7%
 
29161.7%
 
32351.4%
 
29451.4%
 
28551.4%
 
28751.4%
 
30241.1%
 
31841.1%
 
Other values (101)19052.5%
 
ValueCountFrequency (%) 
17710.3%
 
18810.3%
 
19410.3%
 
20210.3%
 
20410.3%
 
20810.3%
 
21210.3%
 
21310.3%
 
21610.3%
 
22110.3%
 
ValueCountFrequency (%) 
36420.6%
 
36320.6%
 
36210.3%
 
36110.3%
 
36010.3%
 
35720.6%
 
35610.3%
 
35510.3%
 
35420.6%
 
35310.3%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
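
The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` and the sample data are illustrative assumptions; this is not the code pandas-profiling runs internally.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 denominator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / (std_x * std_y)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
# Shifting and rescaling x (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
print(r, r_rescaled)
```

The undefined case for constant variables is relevant here: country_region_code and country_region are constant in this dataset, so no r can be computed for them.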

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
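
A minimal sketch of this rank-based calculation follows. It assigns tied values the average of their ranks, which is a common convention but an assumption here; the helper names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y_cubed = [v ** 3 for v in x]  # nonlinear but strictly increasing
print(spearman_rho(x, y_cubed), pearson_r(x, y_cubed))
```

For the cubic relationship in the example, ρ reaches 1 while r stays below 1, which illustrates why ρ is preferred for nonlinear monotonic relationships.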

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
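
The pair-counting definition can be sketched directly. This is the simplest variant (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations may adjust for ties differently, so treat it as an illustration of the definition rather than of the report's exact computation. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither total
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Six pairs in total: five concordant, one discordant.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```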

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
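
As an illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table of counts, following the correction commonly attributed to Bergsma (2013). The function name and the list-of-rows input format are assumptions made for this example, and the report's own implementation may differ in detail.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias corrections for phi squared and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denominator)


perfect = cramers_v_corrected([[50, 0], [0, 50]])       # one variable determines the other
independent = cramers_v_corrected([[25, 25], [25, 25]])  # no association
print(perfect, independent)
```

A pair of columns with identical values, such as state_code and state_codes in this dataset, corresponds to the perfect-association case.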

Missing values

Sample

First rows

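 | df_index | Date | country_region_code | country_region | state_codes | sub_region_2 | retail_and_recreation | grocery_and_pharmacy | parks | transit_stations | workplaces | residential | confirm_cases | confirm_mv3 | confirm_mv7 | confirm_mv14 | confirm_mv21 | confirm_mv28 | state_code | TMAX | TMIN | PRCP | TAVG
0 | 0 | 2020-03-14 | IN | India | TN | NaN | -4.0 | 1.0 | 1.0 | -3.0 | 1.0 | 3.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | TN | 332.0 | NaN | 0.0 | 274.0
1 | 1 | 2020-03-15 | IN | India | TN | NaN | -8.0 | 3.0 | -3.0 | -7.0 | 1.0 | 3.0 | 0.0 | NaN | NaN | NaN | NaN | NaN | TN | 332.0 | 245.0 | 0.0 | 288.0
2 | 2 | 2020-03-16 | IN | India | TN | NaN | -7.0 | 5.0 | -3.0 | -5.0 | 5.0 | 3.0 | 0.0 | 0.333333 | NaN | NaN | NaN | NaN | TN | NaN | 250.0 | 0.0 | 293.0
3 | 3 | 2020-03-17 | IN | India | TN | NaN | -8.0 | 7.0 | -1.0 | -5.0 | -5.0 | 4.0 | 0.0 | 0.000000 | NaN | NaN | NaN | NaN | TN | 343.0 | 244.0 | 0.0 | 293.0
4 | 4 | 2020-03-18 | IN | India | TN | NaN | -13.0 | 10.0 | -7.0 | -7.0 | -6.0 | 5.0 | 1.0 | 0.333333 | NaN | NaN | NaN | NaN | TN | 345.0 | 250.0 | 0.0 | 292.0
5 | 5 | 2020-03-19 | IN | India | TN | NaN | -17.0 | 7.0 | -10.0 | -12.0 | -7.0 | 7.0 | 1.0 | 0.666667 | NaN | NaN | NaN | NaN | TN | NaN | 246.0 | 0.0 | 297.0
6 | 6 | 2020-03-20 | IN | India | TN | NaN | -18.0 | 11.0 | -12.0 | -14.0 | -10.0 | 7.0 | 0.0 | 0.666667 | 0.428571 | NaN | NaN | NaN | TN | 352.0 | 245.0 | 0.0 | 292.0
7 | 7 | 2020-03-21 | IN | India | TN | NaN | -25.0 | 17.0 | -20.0 | -15.0 | -9.0 | 7.0 | 3.0 | 1.333333 | 0.714286 | NaN | NaN | NaN | TN | 345.0 | 251.0 | 0.0 | 294.0
8 | 8 | 2020-03-22 | IN | India | TN | NaN | -78.0 | -79.0 | -52.0 | -67.0 | -46.0 | 22.0 | 3.0 | 2.000000 | 1.142857 | NaN | NaN | NaN | TN | 344.0 | 263.0 | 0.0 | 293.0
9 | 9 | 2020-03-23 | IN | India | TN | NaN | -26.0 | 21.0 | -15.0 | -24.0 | -29.0 | 13.0 | 3.0 | 3.000000 | 1.571429 | NaN | NaN | NaN | TN | 345.0 | 248.0 | 0.0 | 288.0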
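
The confirm_mv3 and confirm_mv7 values in the first rows are consistent with trailing moving averages of confirm_cases over 3 and 7 rows. This is an inference from the sample, not something the report states, and the helper `rolling_mean` below is a hypothetical stand-in for however the columns were actually produced.

```python
def rolling_mean(values, window):
    """Trailing moving average; None until a full window is available."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)  # not enough history yet (shown as NaN in the sample)
        else:
            window_values = values[i - window + 1 : i + 1]
            result.append(sum(window_values) / window)
    return result


# confirm_cases for the first ten rows, copied from the sample above.
confirm_cases = [1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 3.0, 3.0, 3.0]

mv3 = rolling_mean(confirm_cases, window=3)
mv7 = rolling_mean(confirm_cases, window=7)

for row, (a, b) in enumerate(zip(mv3, mv7)):
    fmt = lambda v: "NaN" if v is None else f"{v:.6f}"
    print(row, fmt(a), fmt(b))
```

Under this assumption the first two confirm_mv3 values and the first six confirm_mv7 values are undefined, which agrees with the NaN entries in the sample and with the missing counts of 2 and 6 reported for those variables.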

Last rows

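 | df_index | Date | country_region_code | country_region | state_codes | sub_region_2 | retail_and_recreation | grocery_and_pharmacy | parks | transit_stations | workplaces | residential | confirm_cases | confirm_mv3 | confirm_mv7 | confirm_mv14 | confirm_mv21 | confirm_mv28 | state_code | TMAX | TMIN | PRCP | TAVG
352 | 352 | 2020-05-04 | IN | India | KA | NaN | -75.0 | -25.0 | -69.0 | -47.0 | -57.0 | 26.0 | 37.0 | 20.666667 | 19.857143 | 17.357143 | 19.238095 | 17.428571 | KA | 338.0 | 230.0 | 0.0 | 287.0
353 | 353 | 2020-05-05 | IN | India | KA | NaN | -75.0 | -25.0 | -70.0 | -46.0 | -57.0 | 25.0 | 22.0 | 24.000000 | 21.428571 | 18.214286 | 19.666667 | 17.785714 | KA | 348.0 | 234.0 | 0.0 | 291.0
354 | 354 | 2020-05-06 | IN | India | KA | NaN | -74.0 | -22.0 | -70.0 | -44.0 | -55.0 | 25.0 | 20.0 | 26.333333 | 22.571429 | 19.000000 | 19.714286 | 18.285714 | KA | 343.0 | NaN | 0.0 | 272.0
355 | 355 | 2020-05-07 | IN | India | KA | NaN | -74.0 | -21.0 | -69.0 | -45.0 | -55.0 | 26.0 | 12.0 | 18.000000 | 20.000000 | 18.571429 | 18.571429 | 18.142857 | KA | 336.0 | 231.0 | 0.0 | 269.0
356 | 356 | 2020-05-08 | IN | India | KA | NaN | -74.0 | -18.0 | -69.0 | -45.0 | -53.0 | 25.0 | 48.0 | 26.666667 | 23.428571 | 19.928571 | 18.761905 | 19.500000 | KA | 325.0 | 234.0 | 0.0 | 277.0
357 | 357 | 2020-05-09 | IN | India | KA | NaN | -75.0 | -22.0 | -72.0 | -45.0 | -43.0 | 21.0 | 41.0 | 33.666667 | 27.571429 | 21.000000 | 19.523810 | 20.678571 | KA | NaN | 238.0 | 0.0 | 291.0
358 | 358 | 2020-05-10 | IN | India | KA | NaN | -76.0 | -27.0 | -73.0 | -43.0 | -24.0 | 16.0 | 54.0 | 47.666667 | 33.428571 | 24.642857 | 21.809524 | 22.000000 | KA | 352.0 | 240.0 | 0.0 | 290.0
359 | 359 | 2020-05-11 | IN | India | KA | NaN | -72.0 | -17.0 | -69.0 | -44.0 | -51.0 | 25.0 | 14.0 | 36.333333 | 30.142857 | 25.000000 | 21.619048 | 21.964286 | KA | 353.0 | 236.0 | 0.0 | 283.0
360 | 360 | 2020-05-12 | IN | India | KA | NaN | -72.0 | -17.0 | -69.0 | -43.0 | -52.0 | 24.0 | 63.0 | 43.666667 | 36.000000 | 28.714286 | 24.142857 | 23.750000 | KA | 340.0 | 234.0 | 0.0 | 276.0
361 | 361 | 2020-05-13 | IN | India | KA | NaN | -72.0 | -17.0 | -69.0 | -42.0 | -51.0 | 23.0 | 34.0 | 37.000000 | 38.000000 | 30.285714 | 25.333333 | 24.285714 | KA | 338.0 | 224.0 | 0.0 | 279.0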